{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "30eb1704-8d76-4bc9-9308-93243aeb69cb",
   "metadata": {},
   "source": [
    "## This demo app shows:\n",
    "* How to use LlamaIndex, an open source library to help you build custom data augmented LLM applications\n",
    "* How to ask Llama questions about recent live data via the You.com live search API and LlamaIndex\n",
    "\n",
    "The LangChain package is used to facilitate the call to Llama2 hosted on OctoAI\n",
    "\n",
    "**Note** We will be using OctoAI to run the examples here. You will need to first sign into [OctoAI](https://octoai.cloud/) with your Github or Google account, then create a free API token [here](https://octo.ai/docs/getting-started/how-to-create-an-octoai-access-token) that you can use for a while (a month or $10 in OctoAI credits, whichever one runs out first).\n",
    "After the free trial ends, you will need to enter billing info to continue to use Llama2 hosted on OctoAI."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "68cf076e",
   "metadata": {},
   "source": [
    "We start by installing the necessary packages:\n",
    "- [langchain](https://python.langchain.com/docs/get_started/introduction) which provides RAG capabilities\n",
    "- [llama-index](https://docs.llamaindex.ai/en/stable/) for data augmentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1d0005d6-e928-4d1a-981b-534a40e19e56",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index langchain"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21fe3849",
   "metadata": {},
   "outputs": [],
   "source": [
    "# use ServiceContext to configure the LLM used and the custom embeddings\n",
    "from llama_index import ServiceContext\n",
    "\n",
    "# VectorStoreIndex is used to index custom data \n",
    "from llama_index import VectorStoreIndex\n",
    "\n",
    "from langchain.llms.octoai_endpoint import OctoAIEndpoint"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "73e8e661",
   "metadata": {},
   "source": [
    "Next we set up the OctoAI token."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d9d76e33",
   "metadata": {},
   "outputs": [],
   "source": [
    "from getpass import getpass\n",
    "import os\n",
    "\n",
    "OCTOAI_API_TOKEN = getpass()\n",
    "os.environ[\"OCTOAI_API_TOKEN\"] = OCTOAI_API_TOKEN"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f8ff812b",
   "metadata": {},
   "source": [
    "In this example we will use the [YOU.com](https://you.com/) search engine to augment the LLM's responses.\n",
    "To use the You.com Search API, you can email api@you.com to request an API key. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "75275628-5235-4b55-8033-601c76107528",
   "metadata": {},
   "outputs": [],
   "source": [
    "YOUCOM_API_KEY = getpass()\n",
    "os.environ[\"YOUCOM_API_KEY\"] = YOUCOM_API_KEY"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cb210c7c",
   "metadata": {},
   "source": [
    "We then call the Llama 2 model from OctoAI.\n",
    "\n",
    "We will use the Llama 2 13b chat FP16 model. You can find more on Llama 2 models on the [OctoAI text generation solution page](https://octoai.cloud/tools/text).\n",
    "\n",
    "At the time of writing this notebook the following Llama models are available on OctoAI:\n",
    "* llama-2-13b-chat\n",
    "* llama-2-70b-chat\n",
    "* codellama-7b-instruct\n",
    "* codellama-13b-instruct\n",
    "* codellama-34b-instruct\n",
    "* codellama-70b-instruct"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c12fc2cb",
   "metadata": {},
   "outputs": [],
   "source": [
    "# set llm to be using Llama2 hosted on OctoAI\n",
    "llama2_13b = \"llama-2-13b-chat-fp16\"\n",
    "\n",
    "llm = OctoAIEndpoint(\n",
    "    endpoint_url=\"https://text.octoai.run/v1/chat/completions\",\n",
    "    model_kwargs={\n",
    "        \"model\": llama2_13b,\n",
    "        \"messages\": [\n",
    "            {\n",
    "                \"role\": \"system\",\n",
    "                \"content\": \"You are a helpful, respectful and honest assistant.\"\n",
    "            }\n",
    "        ],\n",
    "        \"max_tokens\": 500,\n",
    "        \"top_p\": 1,\n",
    "        \"temperature\": 0.01\n",
    "    },\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "476d72da",
   "metadata": {},
   "source": [
    "Using our api key we set up earlier, we make a request from YOU.com for live data on a particular topic."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "effc9656-b18d-4d24-a80b-6066564a838b",
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "query = \"Meta Connect\" # you can try other live data query about sports score, stock market and weather info \n",
    "headers = {\"X-API-Key\": os.environ[\"YOUCOM_API_KEY\"]}\n",
    "data = requests.get(\n",
    "    f\"https://api.ydc-index.io/search?query={query}\",\n",
    "    headers=headers,\n",
    ").json()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8bed3baf-742e-473c-ada1-4459012a8a2c",
   "metadata": {},
   "outputs": [],
   "source": [
    "# check the query result in JSON\n",
    "import json\n",
    "\n",
    "print(json.dumps(data, indent=2))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b196e697",
   "metadata": {},
   "source": [
    "We then use the [`JSONLoader`](https://llamahub.ai/l/file-json) to extract the text from the returned data. The `JSONLoader` gives us the ability to load the data into LamaIndex.\n",
    "In the next cell we show how to load the JSON result with key info stored as \"snippets\".\n",
    "\n",
    "However, you can also add the snippets in the query result to documents like below:\n",
    "```python \n",
    "from llama_index import Document\n",
    "snippets = [snippet for hit in data[\"hits\"] for snippet in hit[\"snippets\"]]\n",
    "documents = [Document(text=s) for s in snippets]\n",
    "```\n",
    "This can be handy if you just need to add a list of text strings to doc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7c40e73f-ca13-4f4a-a753-e613df3d389e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# one way to load the JSON result with key info stored as \"snippets\"\n",
    "from llama_index import download_loader\n",
    "\n",
    "JsonDataReader = download_loader(\"JsonDataReader\")\n",
    "loader = JsonDataReader()\n",
    "documents = loader.load_data([hit[\"snippets\"] for hit in data[\"hits\"]])\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8e5e3b4e",
   "metadata": {},
   "source": [
    "With the data set up, we create a vector store for the data and a query engine for it.\n",
    "\n",
    "For our embeddings we will use `OctoAIEmbeddings` whose default embedding model is GTE-Large. This model provides a good balance between speed and performance.\n",
    "\n",
    "For more info see https://octoai.cloud/tools/text/embeddings?mode=demo&model=thenlper%2Fgte-large. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a5de3080-2c4b-479c-baba-793b3bee36ed",
   "metadata": {},
   "outputs": [],
   "source": [
    "# use OctoAI embeddings \n",
    "from langchain_community.embeddings import OctoAIEmbeddings\n",
    "from llama_index.embeddings import LangchainEmbedding\n",
    "\n",
    "\n",
    "embeddings = LangchainEmbedding(OctoAIEmbeddings(\n",
    "    endpoint_url=\"https://text.octoai.run/v1/embeddings\"\n",
    "))\n",
    "print(embeddings)\n",
    "\n",
    "# create a ServiceContext instance to use Llama2 and custom embeddings\n",
    "service_context = ServiceContext.from_defaults(llm=llm, chunk_size=800, chunk_overlap=20, embed_model=embeddings)\n",
    "\n",
    "# create vector store index from the documents created above\n",
    "index = VectorStoreIndex.from_documents(documents, service_context=service_context)\n",
    "\n",
    "# create query engine from the index\n",
    "query_engine = index.as_query_engine(streaming=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2c4ea012",
   "metadata": {},
   "source": [
    "We are now ready to ask Llama 2 a question about the live data using our query engine."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "de91a191-d0f2-498e-88dc-b2b43423e0e5",
   "metadata": {},
   "outputs": [],
   "source": [
    "# ask Llama2 a summary question about the search result\n",
    "response = query_engine.query(\"give me a summary\")\n",
    "print(str(response))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "72814b20-06aa-4da8-b4dd-f0b0d74a2ea0",
   "metadata": {},
   "outputs": [],
   "source": [
    "# more questions\n",
    "print(str(query_engine.query(\"what products were announced\")))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a65bc037-a689-476d-b529-0059a27bc949",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(str(query_engine.query(\"tell me more about Meta AI assistant\")))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "16a56542",
   "metadata": {},
   "outputs": [],
   "source": [
    "print(str(query_engine.query(\"what are Generative AI stickers\")))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
